Performance Improvement Of Bengali Text Compression Using Transliteration And Huffman Principle
Authors
Abstract
In this paper, we propose a new compression technique based on transliteration of Bengali text to English. English uses fewer distinct symbols than Bengali, so transliterating Bengali text to English reduces the number of characters to be coded. Huffman coding is well known for producing an optimal prefix code for a given set of symbol frequencies. When the Huffman principle is applied to the transliterated text, significant performance improvement is achieved in terms of decoding speed and space requirement compared to Unicode compression.
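As a rough illustration of the idea in the abstract, the following Python sketch transliterates a short Bengali string to Latin letters, builds a Huffman code over the result, and compares the encoded bit length with the UTF-8 size of the original. The table `TRANSLIT`, the function names, and the sample word are assumptions made for this example only; the paper's actual transliteration scheme is not given here.

```python
import heapq
from collections import Counter

# Hypothetical, deliberately tiny transliteration table (an assumption,
# not the mapping used in the paper).
TRANSLIT = {"ব": "b", "া": "a", "ং": "ng", "ল": "l"}


def transliterate(text):
    """Map each Bengali character to Latin letters; pass others through."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text)


def huffman_code(text):
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the codeword of every symbol in the text."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Walk the bit string, emitting a symbol whenever a codeword completes."""
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


bengali = "বাংলা"  # the word "Bangla"
latin = transliterate(bengali)
code = huffman_code(latin)
bits = encode(latin, code)

print("UTF-8 size of original (bits):", 8 * len(bengali.encode("utf-8")))
print("Huffman-coded transliteration (bits):", len(bits))
```

The comparison ignores the cost of storing the code table, and recovering the Bengali text would additionally require an invertible transliteration mapping, which this toy table does not attempt to provide.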
Related resources
Leveraging Statistical Transliteration for Dictionary-Based English-Bengali CLIR of OCR'd Text
This paper describes experiments with transliteration of out-of-vocabulary English terms into Bengali to improve the effectiveness of English-Bengali Cross-Language Information Retrieval. We use a statistical translation model as a basis for transliteration, and present evaluation results on the FIRE 2011 RISOT Bengali test collection. Incorporating transliteration is shown to substantially and...
An Enhanced Static Data Compression Scheme Of Bengali Short Message
This paper concerns a modified approach to compressing short Bengali text messages for small devices. The prime objective of this research technique is to establish a low-complexity compression scheme suitable for small devices having small memory and relatively lower processing speed. The basic aim is not to compress text of any size up to its maximum level without having any constraint on space...
Revisiting Automatic Transliteration Problem for Code-Mixed Romanized Indian Social Media Text
Although automatic transliteration for Indian languages is a well-studied paradigm, available transliteration techniques fail in the Indian social media context due to phenomena such as wordplay, creative spelling, code-mixing, and phonetic romanized typing, all implying that transliteration for Indian social media text has to be revisited. The paper reports an initial study on automatic ...
An Effective Approach for Compression of Bengali Text
In this paper, we propose an effective and efficient approach for compressing Bengali text. This paper focuses on a methodical study of Bengali text compression techniques. The main target of this research is to provide a framework for Bengali text compression that ensures a simple, computationally inexpensive, and effective scheme. The proposed Bengali text compres...
Design and Analysis of an Effective Corpus for Evaluation of Bengali Text Compression Schemes
In this paper, we propose an effective platform for evaluation of Bengali text compression schemes. A novel scheme for construction of Bengali text compression corpus has also been incorporated in this paper. A methodical study on the formulation-approaches of text corpus for data compression and present an effective corpus named Ekushe-Khul for evaluating the Bengali text compression schemes h...